Information Extraction: Methodologies and Applications
نویسندگان
چکیده
This chapter is concerned with the methodologies and applications of information extraction. Information is hidden in the large volume of web pages and thus it is necessary to extract useful information from the web content, called Information Extraction. In information extraction, given a sequence of instances, we identify and pull out a sub-sequence of the input that represents information we are interested in. In the past years, there was a rapid expansion of activities in the information extraction area. Many methods have been proposed for automating the process of extraction. However, due to the heterogeneity and the lack of structure of Web data, automated discovery of targeted or unexpected knowledge information still presents many challenging research problems. In this chapter, we will investigate the problems of information extraction and survey existing methodologies for solving these problems. Several real-world applications of information extraction will be introduced. Emerging challenges will be discussed.
منابع مشابه
A Review of Relation Extraction
Many applications in information extraction, natural language understanding, information retrieval require an understanding of the semantic relations between entities. We present a comprehensive review of various aspects of the entity relation extraction task. Some of the most important supervised and semi-supervised classification approaches to the relation extraction task are covered in suffi...
متن کاملESM-IL: Entity Extraction from Social Media Text for Indian Languages @ FIRE 2015 - An Overview
Entity recognition is a very important sub task of Information extraction and find its applications in information retrieval, machine translation and other higher Natural Language Processing (NLP) applications such as co-reference resolution. Entities are real world elements or objects such as Person names, Organization names, Product names, Location names. Entities are often referred to as Nam...
متن کاملPresenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
متن کاملCMEE-IL: Code Mix Entity Extraction in Indian Languages from Social Media Text @ FIRE 2016 - An Overview
The penetration of smart devices such as mobile phones, tabs has significantly changed the way people communicate. This has led to the growth of usage of social media tools such as twitter, facebook chats for communication. This has led to development of new challenges and perspectives in the language technologies research. Automatic processing of such texts requires us to develop new methodolo...
متن کاملVisualization of Text Streams: A Survey
This work presents related areas of research, types of data collections that are visualized, technical aspects of generating visualizations, and evaluation methodologies. Existing methods are structured and explained from the aspect of visualization process. Successful applications are noted and some future trends in the field are anticipated. Keywords— Information Visualization, Visual Analyti...
متن کامل